***Note: replicators should set directory to open data 
*cd ""

 
* The first step is to download round 2018 OECD PISA data from  https://www.oecd.org/pisa/data/2018database/

*Notice that you need to download both the Student questionnaire data files  (597 MB) for individual-level variables and the School questionnaire data file (3.7 MB) for school-level variables.

*Notice that both dataset come in .sav format. Hence, the first step once downloaded is to turn them in .dta, which is done by the command
* import spss using "[dataset name].sav", clear


********** CLEANING DATA **********


* Begin from the school-level dataset, to be later merged, to be found here: 

 import spss using "[School dataset].sav", clear

*Harmonize country names

 kountry CNT, from(iso3c)  
 rename NAMES_STD country
 replace country="Kosovo" if country=="ksv"
 replace country="Baku" if country=="qaz"
 replace country="China" if country=="qci"
 replace country="Moscow" if country=="qmr"
 replace country="Tatartsan" if country=="qrt"
 replace country="Chinese Taipei" if country=="tap"

*Rename some variables, later used as macro-level variables in the multilevel model

rename SC001 Urban

g Size=SC003

g Private=0
replace Private=1 if SC013Q01TA==2

g Single_sex=0
replace Single_sex=1 if SC002Q01TA==0 | SC002Q02TA==0

*Generate an additional measure of inclusion, mentioned in the paper

pca SC165Q01HA SC165Q02HA SC165Q03HA SC165Q04HA SC165Q05HA SC165Q06HA SC165Q07HA SC165Q08HA SC165Q09HA SC013Q01TA
predict Inclusion_school 
egen min_Inclusion_school  = min(Inclusion_school)
egen max_Inclusion_school = max(Inclusion_school)
replace Inclusion_school  = (Inclusion_school - min_Inclusion_school ) / (max_Inclusion_school  - min_Inclusion_school )
replace Inclusion_school=1-Inclusion_school

save School.dta, replace

*now move to student's dataset

clear all
import spss using "[name of dataset].sav", clear

* standardize country names 
kountry CNT, from(iso3c)  
rename NAMES_STD country
 replace country="Kosovo" if country=="ksv"
 replace country="Baku" if country=="qaz"
 replace country="China" if country=="qci"
 replace country="Moscow" if country=="qmr"
 replace country="Tatartsan" if country=="qrt"
 replace country="Chinese Taipei" if country=="tap"

*Generate dummy identifying same sample as for micro-level analysis in 4.1 as well as one for the whole OECD sample - indeed, Pisa has some non-OECD countries

 g EU=0

replace EU=1 if CNTRYID==276 | CNTRYID==40  | CNTRYID==300  | CNTRYID==352  | CNTRYID==372  | CNTRYID==380  | CNTRYID==724  | CNTRYID==756  | CNTRYID==620 | CNTRYID==56 | CNTRYID==208 | CNTRYID==246  | CNTRYID==250 | CNTRYID==826  | CNTRYID==528 | CNTRYID==578 | CNTRYID==752
 replace EU=1 if country=="Czech Republic"  | country=="Slovakia" | country=="Poland" | country=="Lithuania" | country=="Estonia" | country=="Poland" | country=="Slovenia" | country=="Russia" 

 g oecd=0
 replace oecd=1 if country=="Australia" | country=="Austria" | country=="Belgium" | country=="Canada" | country=="Chile" | country=="Colombia" | country=="Czech Republic" | country=="Denmark" | country=="Estonia" | country=="Finland" | country=="France" | country=="Germany" | country=="Greece" | country=="Hungary" | country=="Iceland" | country=="Ireland" | country=="Israel" | country=="Italy" | country=="Japan" | country=="South Korea" | country=="Latvia" | country=="Lithuania" | country=="Luxembourg" | country=="Mexico" | country=="Netherlands" | country=="New Zealand" | country=="Norway" | country=="Poland" | country=="Portugal" | country=="Slovakia" | country=="Slovenia" | country=="Spain" | country=="Sweden" | country=="Switzerland" | country=="Turkey" | country=="United Kingdom" | country=="United States"
 
*generate subgroup of first or second generation immigrant
 
decode LANGN, generate(LANGN_str)
g Minority=0
replace Minority=1 if IMMIG>1
replace Minority=1 if IMMIG==.

*Main dependent variables, scaled between 0-1 for both discriminatory attitudes and perceptions of discrimination

g D=ATTIMM
replace D=. if Minority==1
egen min_D = min(D)
egen max_D = max(D)
replace D = (D - min_D) / (max_D - min_D)
replace D=1-D
 
g PD=DISCRIM
replace PD=. if Minority==0
egen min_PD = min(PD)
egen max_PD = max(PD)
replace PD = (PD - min_PD) / (max_PD - min_PD)


*Main Independent variables


*Cognitive sophistication

g Cog=GCSELFEFF
egen min_Cognitive = min(Cog)
egen max_Cognitive = max(Cog)
replace Cog = (Cog - min_Cognitive) / (max_Cognitive - min_Cognitive)

*Contact
 
g Contact_school=.
replace Contact_school=1 if ST220Q02HA==1
replace Contact_school=0 if ST220Q02HA==2

*Inclusion (for rubustness check)

*for rubustness mention in footnote: see paper. First recoding variables, then taking pca, then creating school-level score

  replace ST221Q01HA =2-ST221Q01HA
  replace ST221Q03HA= 2-ST221Q03HA  
  replace ST221Q06HA= 2-ST221Q06HA
  replace ST221Q09HA =2-ST221Q09HA
  replace ST221Q11HA =2-ST221Q11HA
  replace ST221Q02HA= 2-ST221Q02HA
  replace ST221Q04HA= 2-ST221Q04HA
  replace ST221Q05HA= 2-ST221Q05HA
  replace ST221Q07HA= 2-ST221Q07HA
  replace ST221Q08HA= 2-ST221Q08HA
  
pca ST221Q01HA ST221Q02HA ST221Q03HA ST221Q04HA ST221Q05HA ST221Q06HA ST221Q07HA ST221Q08HA ST221Q09HA ST221Q11HA 
predict InclusionPCA 
sort InclusionPCA 
  
*generate and rescale 0-1
egen Inclusion2  = mean(InclusionPCA), by(CNTSCHID)
egen min_Inclusion= min(Inclusion2)
egen max_Inclusion = max(Inclusion2)
replace Inclusion2 = (Inclusion2 - min_Inclusion) / (max_Inclusion- min_Inclusion)  

*Control variables

g Contact_area=. 
replace Contact_area=1 if ST220Q03HA==1  
replace Contact_area=0 if ST220Q03HA==2

g Contact_social =.
replace Contact_social=1 if ST220Q04HA==1
replace Contact_social=0 if ST220Q04HA==2

g Contact_family=.
replace Contact_family=1 if ST220Q01HA==1
replace Contact_family=0 if ST220Q01HA==2  

pca  PV1MATH PV1READ PV1SCIE
predict Pisa
egen min_Pisa = min(Pisa)
egen max_Pisa = max(Pisa)
replace Pisa = (Pisa - min_Pisa) / (max_Pisa- min_Pisa)
  
g Edu_mum=(5-ST005Q01TA)/4

rename ST001D01T Education

 
rename AGE Age
 
g Female=1 if ST004D01T==1
replace Female=0 if ST004D01T==2
 
*rename ISCEDL Edu

encode STRATUM, g(Region)

*In this merge, I used "School.dta" as created before, of course change the name of the dta accordingly if you had chosen another name
 
merge m:1 CNTSCHID using "School.dta" , keepusing(CNTRYID Urban Size  Urban Size  Single_sex Private SC167Q01HA) 
keep if _merge==3

g Inclusion=2-SC167Q01HA

save Pisa.dta, replace
 
global control_i "Pisa Female Edu_mum  WEALTH  Contact_social Contact_area  i.CNTRYID"
global control_s "Urban Single_sex Size  Private i.Education i.ISCEDO"

  ****** DIRECT EFFECTS *********
  
quietly:  mixed D  Cog  Contact_school Inclusion  $control_i $control_s  || CNTSCHID: Cog Contact_school   if EU==1, mle nostderr
est store D_EU
estadd local CountryFE "Yes"
estadd local Individual controls "Yes"
estadd local  School controls "Yes"
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_EU=  e(N)
su D if e(sample), mean
loc Mean_EU:di%8.2fc r(mean)

quietly:  mixed D  Cog  Contact_school Inclusion  $control_i $control_s  || CNTSCHID: Cog  Contact_school    , mle nostderr
est store D_oecd
estadd local CountryFE "Yes"
estadd local Individual controls "Yes"
estadd local  School controls "Yes"
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_oecd=  e(N)
su D if e(sample), mean
loc Mean_oecd:di%8.2fc r(mean)


  grstyle init
 	 grstyle set legend 2,  nobox
	 grstyle set size 8pt: tick_label key_label
	 grstyle set size 12pt: heading
	 grstyle set size 10pt: subheading axis_title
	grstyle set color      "0 0 0"    "200 200 200" 
    grstyle set graphsize 16cm 13cm 
coefplot (D_EU, label( "ESS sample" "{it:N}=`N_EU'")) (D_oecd, label( "All" "{it:N}=`N_oecd'")) , keep(  Cog Contact_school Inclusion)  xline(0,  lpattern(dash)  )   msymbol(d)  levels(95) ciopts(recast(. rcap))   coeflabels( Cog= `""Cognitive" "sophistication"'   Contact_school= `""Intergroup" "contact"'  Inclusion= `""Liberal" "instruction"' , labsize(vsmall) notick labgap(3))   xtitle("Anti-immigration attitudes", size(medsmall)) ylabel(,angle(vertical) labsize(small)) yline(2.5,  lpattern(dot) ) xscale(r(-.2 .2)) xlabel(-.2(.1).2) legend(position(12) rows(1) nobox span )  note("Individual-level controls include gender, maternal and paternal education, country fixed effects." "School-level controls include student staff-ratio, urbanity status, financial/staffing issues, type of school" "(general or vocational) and grade, private or public status." "Outcome means: ESS`Mean_EU' |  All `Mean_oecd'." , size(vsmall) span)   
graph export "Graph/PisaD_direct.png", replace 
graph save "Graph/PisaD_direct", replace
  
  
  
quietly:  mixed PD  Cog Contact_school Inclusion $control_i $control_s  || CNTSCHID: Cog Contact_school    if EU==1, mle nostderr 
est store PD_EU
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_EU=  e(N)
su PD if e(sample), mean
loc Mean_EU:di%8.2fc r(mean)

quietly:  mixed PD Cog Contact_school Inclusion  $control_i $control_s  || CNTSCHID: Cog  Contact_school    , mle nostderr
est store PD_oecd
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_oecd=  e(N)
su PD if e(sample), mean
loc Mean_oecd:di%8.2fc r(mean)

 
coefplot (PD_EU, label( "ESS sample" "{it:N}=`N_EU'")) (PD_oecd, label( "All" "{it:N}=`N_oecd'")) , keep(Cog Contact_school   Inclusion)  xline(0,  lpattern(dash)  )   msymbol(d)  levels(95) ciopts(recast(. rcap))   coeflabels( Cog= `""Cognitive" "sophistication"'   Contact_school= `""Intergroup" "contact"'  Inclusion= `""Liberal" "instruction"' , labsize(vsmall) notick labgap(3))   xtitle("Discriminatory climate at school", size(medsmall)) ylabel(,angle(vertical) labsize(small)) yline(2.5,  lpattern(dot) ) xscale(r(-.2 .2)) xlabel(-.2(.1).2) legend(position(12) rows(1) nobox span )  note("Individual-level controls include gender, maternal and paternal education, country fixed effects." "School-level controls include student staff-ratio, urbanity status, financial/staffing issues, type of school" "(general or vocational) and grade, private or public status." "Outcome means: ESS`Mean_EU' |  All `Mean_oecd'." , size(vsmall) span)   
graph export "Graph/PisaPD_direct.png", replace 
graph export "Graph/PisaPD_direct.png", replace 
graph save "Graph/PisaPD_direct", replace
  
  
  ****** CROSS EFFECTS *********

quietly:  mixed D  c.Cog#c.Contact_school c.Cog#c.Inclusion  Cog  Contact_school Inclusion  $control_i $control_s  || CNTSCHID: Cog Contact_school  if EU==1, mle nostderr  
est store D_EU
estadd local CountryFE "Yes"
estadd local Individual controls "Yes"
estadd local  School controls "Yes"
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_EU=  e(N)
su D if e(sample), mean
loc Mean_EU:di%8.2fc r(mean)

quietly:  mixed D c.Cog#c.Contact_school c.Cog#c.Inclusion  Cog  Contact_school Inclusion  $control_i $control_s  || CNTSCHID: Cog   Contact_school   , mle nostderr
est store D_oecd
estadd local CountryFE "Yes"
estadd local Individual controls "Yes"
estadd local  School controls "Yes"
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_oecd=  e(N)
su D if e(sample), mean
loc Mean_oecd:di%8.2fc r(mean)
 
 
  grstyle init
 	 grstyle set legend 2,  nobox
	 grstyle set size 8pt: tick_label key_label
	 grstyle set size 12pt: heading
	 grstyle set size 10pt: subheading axis_title
	grstyle set color      "0 0 0"    "170 170 170" 
    grstyle set graphsize 23cm 13cm 
coefplot (D_EU, label( "ESS sample" "{it:N}=`N_EU'")) (D_oecd, label( "All" "{it:N}=`N_oecd'")) , keep( c.Cog#c.Contact_school c.Cog#c.Inclusion Cog Contact_school Inclusion)  xline(0,  lpattern(dash)  )   msymbol(d)  levels(95) ciopts(recast(. rcap))   coeflabels( c.Cog#c.Contact_school = `" "Cognitive"   " sophistication" "{bf:x} contact" "'   c.Cog#c.Inclusion = `" "Cognitive"   "sophistication" "{bf:x} instruction" "' Cog= `""Cognitive" "sophistication"'   Contact_school= `""Intergroup" "contact"'  Inclusion= `""Liberal" "instruction"'  , labsize(vsmall) notick labgap(3))  groups( c.Cog#c.Contact_school  c.Cog#c.Inclusion = "{bf:Cross-level effects}"  , nogap)   xtitle("Anti-immigration attitudes ", size(medsmall)) ylabel(,angle(vertical) labsize(small)) yline(2.5,  lpattern(dot) ) xscale(r(-.2 .2)) xlabel(-.2(.1).2) legend(position(12) rows(1) nobox span )  note("Individual-level controls include gender, maternal education, family wealth, social contact and country fixed effects." "School-level controls include student staff-ratio, urbanity status, size and type of school" "(general or vocational) and grade, private or public status." "Outcome means: European sample `Mean_EU' |  All `Mean_oecd'." , size(vsmall) span)   
graph export "Graph/PisaD.png", replace 
graph save "Graph/PisaD", replace
 
 esttab  D_EU    D_oecd  using "Table/Dpisa.tex", replace se star(* 0.1 ** 0.05 *** 0.01) obslast collabels(, none)   wrap  legend  gaps nolz  booktabs mlabel("ESS sample" "All"  )  nonumbers  cells(b(star fmt(3)  ) se(fmt(3) par  )) compress label title(Discriminatory attitudes at school.)   keep(c.Cog#c.Contact_school c.Cog#c.Inclusion  Cog Contact_school  Inclusion )   coeflabels(c.Cog#c.Contact_school   `" "Cognitive sophistication"   "{bf: x}" "contact" "'   c.Cog#c.Inclusion   `" "Cognitive sophistication"   "{bf: x}" "instruction" "' Cog  `""Cognitive" "sophistication"'   Contact  `" "Intergroup" "contact" "'   Inclusion  `" "Liberal" "instruction" "'  )        stats(  CountryFE   Individual_controls  School_controls N  , fmt(a1 a2 a3   %9.0fc  )  labels( `"Country FE"'  `"Individual_controls"' `"School"'   `"Estimator"' `"N.obs"'   ) ) 
  

quietly:  mixed PD c.Cog#c.Contact_school  c.Cog#c.Inclusion  Cog Contact_school Inclusion $control_i $control_s  || CNTSCHID: Cog  Contact_school  if EU==1, mle  nostderr
est store PD_EU
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_EU=  e(N)
su PD if e(sample), mean
loc Mean_EU:di%8.2fc r(mean)

quietly:  mixed PD c.Cog#c.Contact_school c.Cog#c.Inclusion  Cog Contact_school Inclusion  $control_i $control_s  || CNTSCHID: Cog Contact_school , mle nostderr
est store PD_oecd
*esttab D, keep(c.Cog#c.Contact c.Cog#c.Inclusion) star(* 0.10 ** 0.05 *** 0.01)
local N_oecd=  e(N)
su PD if e(sample), mean
loc Mean_oecd:di%8.2fc r(mean)

 
coefplot (PD_EU, label( "ESS sample" "{it:N}=`N_EU'")) (PD_oecd, label( "All" "{it:N}=`N_oecd'")) , keep(c.Cog#c.Contact_school c.Cog#c.Inclusion Cog Contact_school   Inclusion)  xline(0,  lpattern(dash)  )   msymbol(d)  levels(95) ciopts(recast(. rcap))   coeflabels(c.Cog#c.Contact_school = `" "Cognitive"   " sophistication" "{bf:x} contact" "'      c.Cog#c.Inclusion = `" "Cognitive"   "sophistication" "{bf:x} instruction" "' Cog= `""Cognitive" "sophistication"'   Contact_school= `""Intergroup" "contact"'  Inclusion= `""Liberal" "instruction"' , labsize(vsmall) notick labgap(3))  groups(c.Cog#c.Contact_school   c.Cog#c.Inclusion = "{bf:Interaction effects}"  , nogap)   xtitle("Discriminatory climate at school", size(medsmall)) ylabel(,angle(vertical) labsize(small)) yline(2.5,  lpattern(dot) ) xscale(r(-.2 .2)) xlabel(-.2(.1).2) legend(position(12) rows(1) nobox span )  note("Individual-level controls include gender, maternal and paternal education, country fixed effects." "School-level controls include student staff-ratio, urbanity status, financial/staffing issues, type of school" "(general or vocational) and grade, private or public status." "Outcome means: ESS sample `Mean_EU' |  All `Mean_oecd'." , size(vsmall) span)   
graph export "Graph/PisaPD.png", replace 
graph export "Graph/PisaPD.png", replace 
graph save "Graph/PisaPD", replace
 
 
  esttab  PD_EU  PD_oecd  using "Table/PDpisa.tex", replace se star(* 0.1 ** 0.05 *** 0.01) obslast collabels(, none)   wrap  legend  gaps nolz  booktabs mlabel("ESS sample" "All"  )   coeflabels(c.Cog#c.Contact_school   `" "Cognitive sophistication"   "{bf: x}" "contact" "'   c.Cog#c.Inclusion   `" "Cognitive sophistication"   "{bf: x}" "instruction" "' Cog  `""Cognitive" "sophistication"'   Contact_school  `" "Intergroup" "contact" "'   Inclusion  `" "Liberal" "instruction" "'  ) nonumbers  cells(b(star fmt(3)  ) se(fmt(3) par  )) compress label title(Discriminatory attitudes at school.)   keep(c.Cog#c.Contact_school c.Cog#c.Inclusion  Cog Contact_school  Inclusion )       stats(  CountryFE   Individual_controls  School_controls N  , fmt(a1 a2 a3 %9.0fc )  labels( `"Country FE"'  `"Individual_controls"' `"School"'   `"Estimator"' `"N.obs"'   ) ) 

 
*/
 
*With OECD
